Universal tree structures in directed polymers and models of evolving populations 
By measuring or calculating coalescence times for several models of coalescence or evolution, with
and without selection, we show that the ratios of these coalescence times become universal in the 
large size limit and we identify a few universality classes. 



Random trees appear in many contexts in biology, 
mathematics and physics. In evolutionary biology, they 
represent the genealogies of reproducing populations. In 
physics, random trees appear in many systems such as 
DLA (diffusion limited aggregation) [1], coarsening, river
networks [2, 3], diagrams in perturbation theory, the ultrametric
structure of pure states in mean field spin glasses [4, 5],
directed polymers in a random medium [6, 7], shocks in
one-dimensional turbulence [8, 9, 10], etc.

From a mathematical point of view, one of the sim- 
plest examples of random trees is Kingman's coalescent 
[11, 12]: it describes the coalescence tree of particles,
where each pair of particles has a probability $\delta t$ of
coalescing into a single particle during every infinitesimal
time interval $\delta t$. The random tree structures of Kingman's
coalescent are identical to the genealogies obtained
in simple mean field models of neutral evolution such as
the Wright-Fisher model [13, 14]. In such models, each
individual of a population of fixed size N at a given gener- 
ation gives birth to a random number of offspring and the 
population at the next generation is obtained by choosing 
N survivors at random among all these offspring. If one 
follows the evolution over a large enough number of gen- 
erations for the initial condition to be forgotten, a steady 
state is reached where the statistics of the genealogical 
tree of a large population are identical to those of King- 
man's coalescent. 

Other random trees have been considered in the mathematical
literature, such as the $\Lambda$-coalescents [15, 16, 17],
which generalize Kingman's coalescent and describe a
wider class of mean-field coalescence models [18]. In the
$\Lambda$-coalescent, each subset of $k$ particles among $n$ particles
has a probability $\lambda_{n,k}\,\delta t$ of coalescing into a single
particle during an infinitesimal time $\delta t$. As a set of $n$
particles can be considered as a subset of a larger set of
$n+1$ particles, the rates $\lambda_{n,k}$ have to satisfy some
consistency relations: the coalescence of $k$ particles in the
subset of size $n$ happens in two cases: either these $k$
particles coalesce in the set of size $n+1$ (rate $\lambda_{n+1,k}$) or
they coalesce together with the $(n+1)$-th particle (rate
$\lambda_{n+1,k+1}$). Therefore

$$\lambda_{n,k} = \lambda_{n+1,k} + \lambda_{n+1,k+1}. \qquad (1)$$
This recursion leads to the following general expression
for the coalescence rates [15, 16]:

$$\lambda_{n,k} = \int_0^1 x^{k-2} (1-x)^{n-k}\, \Lambda(x)\, dx, \qquad (2)$$

where $\Lambda$ is some positive measure on the interval [0, 1].
With these notations, Kingman's coalescent corresponds
to $\Lambda(x) = \delta(x)$. Another particular case, which has been
studied in the context of spin glasses, is the Bolthausen-Sznitman
coalescent [19], for which $\Lambda(x) = 1$. Trees in
Kingman's coalescent and in the Bolthausen-Sznitman
coalescent have different statistical properties.
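
As an illustrative check of Eqs. (1) and (2) (a sketch, not part of the original derivation): for $\Lambda(x) = 1$ the integral in Eq. (2) is a Beta function, $\lambda_{n,k} = (k-2)!\,(n-k)!/(n-1)!$, and for $\Lambda(x) = \delta(x)$ only pair mergers survive. The function names below are hypothetical; the code verifies with exact rational arithmetic that both families of rates satisfy the consistency relation Eq. (1).

```python
from fractions import Fraction
from math import factorial


def bs_rate(n, k):
    """lambda_{n,k} of Eq. (2) for Lambda(x) = 1 (Bolthausen-Sznitman).

    The integral of x^(k-2) (1-x)^(n-k) over [0, 1] is the Beta function
    B(k-1, n-k+1) = (k-2)! (n-k)! / (n-1)!.
    """
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    return Fraction(factorial(k - 2) * factorial(n - k), factorial(n - 1))


def kingman_rate(n, k):
    """lambda_{n,k} of Eq. (2) for Lambda(x) = delta(x): only pairs merge."""
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    return Fraction(1) if k == 2 else Fraction(0)


def satisfies_consistency(rate, n_max):
    """Check Eq. (1) for all 2 <= k <= n <= n_max."""
    return all(
        rate(n, k) == rate(n + 1, k) + rate(n + 1, k + 1)
        for n in range(2, n_max + 1)
        for k in range(2, n + 1)
    )
```

For example, `satisfies_consistency(bs_rate, 12)` and `satisfies_consistency(kingman_rate, 12)` both evaluate the relation on every pair $(n, k)$ up to $n = 12$.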

In order to compare different models of physical or bi- 
ological systems which generate random trees and to try 
to identify universality classes, we consider here simple 
quantities characteristic of these random tree structures. 
For a tree with a large number of end points, we define
$T_p$ as the distance one has to go up into the tree to find
the most recent common ancestor of $p$ given points (see
Fig. 1).

For models of evolving populations, the distance $T_p$ is
the age of the most recent common ancestor of $p$ individuals
chosen at random in the population. In general,
it depends both on the generation at which these $p$ individuals
live and on the choice of the $p$ individuals,
even in the limit of very large trees. This double
source of fluctuations for the $T_p$ is reminiscent of what
happens in mean field spin glasses [5]: as for the overlaps
in Parisi's theory, the distribution of the $T_p$ remains
broad even when the size of the population becomes very
large [5, 20].




FIG. 1: The times $T_p$ are the ages of the most recent common
ancestors of $p$ individuals chosen at random.

For a given model, one can try to determine averages
$\langle T_p \rangle$ or moments $\langle T_p^k \rangle$ of these times $T_p$ (the averages
are taken over all the branches of the tree, i.e. over all the
population at a given generation, and over all the random
trees, i.e. over all the generations in the language
of models of evolution). In recent works [21, 22], it was
noticed that for a large class of mean field models of evolution
with selection, the ratios of these average times
$\langle T_p \rangle$ take, for a large population, simple universal values
indicating that the genealogical trees are distributed
according to the statistics of the Bolthausen-Sznitman
coalescent. Therefore, at the mean field level and for a
large size of the population, two universality classes seem
to emerge for models of evolution: Kingman's trees in the
case of neutral evolution for which

and Bolthausen-Sznitman's trees in the case of selection:

The goal of the present work is to try to measure these
coalescence ratios for other models of evolution, in particular
to analyse the effect of spatial fluctuations, and
to argue that directed polymers in a random medium are
in the same universality classes as evolution models in
the presence of selection.

The paper is organized as follows. In Section I we consider,
at the mean field level or in finite dimension, coalescence
models which are equivalent, as we will see, to
neutral models of evolution. Above two dimensions of
space, the coalescence trees have the same statistics [23]
as in mean field, with coalescence times given by Eq. (3),
whereas in one dimension, they lead to a different universality
class for which we compute the ratios of coalescence
times. In Section II we consider the trees of
optimal paths in the problem of directed polymers in a
random medium. Our numerical results will show that
at the mean field level, the trees satisfy the Bolthausen-Sznitman
statistics Eq. (4), whereas in finite dimension the ratios of
coalescence times vary with dimension, as expected from the
known universality classes of the problem.



I. COALESCENCE AND MODELS OF
NEUTRAL EVOLUTION

A. Kingman's coalescent 



Kingman's coalescent [11, 12] is a mean field model of
coalescing particles: during each infinitesimal time interval
$\delta t$, every pair of particles has a probability $\delta t$ of coalescing
into a single particle. Therefore if one starts with
$p$ particles, there is a random waiting time $\tau_p$ until a
coalescence event occurs, when these $p$ particles become
$p-1$ particles. Then there is another random time $\tau_{p-1}$
until a pair among these $p-1$ particles coalesces (and
one is left with $p-2$ particles), and so on. The times $\tau_k$
are independent and distributed according to exponential
distributions

and the time $T_p$ for $p$ particles chosen at random to coalesce
is given by:

$$T_p = \tau_p + \tau_{p-1} + \cdots + \tau_3 + \tau_2. \qquad (6)$$

This allows one to recover easily the values of Eq. (3). In
fact the whole generating functions of the times $T_p$ can
be calculated:

In particular one can notice that the time $T_2$ has an exponential
distribution.
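
The moments implied by Eq. (6) can be sketched numerically as follows. This is an illustration under a stated assumption: since each pair coalesces at unit rate, the waiting time $\tau_k$ with $k$ particles present is taken to be exponential with rate $k(k-1)/2$ (the number of pairs). The helper names are hypothetical.

```python
from fractions import Fraction


def pair_count(k):
    """Number of pairs among k particles, assumed to be the rate of tau_k."""
    return Fraction(k * (k - 1), 2)


def mean_T(p):
    """<T_p> from Eq. (6): sum of the means 1/rate of independent exponentials."""
    return sum(1 / pair_count(k) for k in range(2, p + 1))


def second_moment_T(p):
    """<T_p^2> = Var(T_p) + <T_p>^2, with Var(tau_k) = 1/rate^2."""
    variance = sum(1 / pair_count(k) ** 2 for k in range(2, p + 1))
    return variance + mean_T(p) ** 2


ratio_32 = mean_T(3) / mean_T(2)                    # 4/3
ratio_42 = mean_T(4) / mean_T(2)                    # 3/2
ratio_second = second_moment_T(2) / mean_T(2) ** 2  # 2
```

Under this assumption one finds $\langle T_p \rangle = 2(1 - 1/p)$, so that $\langle T_3 \rangle / \langle T_2 \rangle = 4/3$ and $\langle T_4 \rangle / \langle T_2 \rangle = 3/2$, and the exponential law of $T_2$ gives $\langle T_2^2 \rangle / \langle T_2 \rangle^2 = 2$.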



B. Wright-Fisher model 



The Wright-Fisher model [13, 14] is one of the simplest
neutral models of an evolving population. It describes a
population of constant size $N$ with non-overlapping generations
and asexual reproduction. At each generation,
the whole population is replaced by $N$ new individuals with
the following rule: each individual at a given generation
has its parent randomly chosen among the $N$ individuals
at the previous generation. If one goes backward in
time, the lineage of an individual performs a random
walk on a fully connected graph of $N$ sites. Following
the lineages of $p$ individuals is the same as following $p$
coalescing random walks on this fully connected graph.
Since the random walks are independent, the statistics of
coalescence times can be easily calculated [11, 12]: for $p$
individuals chosen at random at generation $g$, the time $T_p$
is the age of their most recent common ancestor, i.e. $T_p$ is
the number of time steps for the $p$ random walkers on the
fully connected graph to coalesce. At each generation in
the past, two distinct lineages have a probability $1/N$ of
merging, thus $T_2$ scales as the size $N$ of the population.
For fixed $p > 2$, the probability that a given pair of lineages
coalesces is $1/N$, whereas multiple coalescences occur with
probabilities of higher order in $1/N$ for large $N$. One can then
neglect these multiple coalescences and

$$T_p \simeq N (\tau_p + \tau_{p-1} + \cdots + \tau_3 + \tau_2), \qquad (8)$$

where the times $\tau_k$ are distributed according to Eq. (5),
implying that the statistics of the times $T_p$ are exactly
the same as in Kingman's coalescent Eq. (3).
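
A minimal Monte Carlo sketch of this backward-in-time picture is given below. It is only an illustration: the function names, the sample sizes and the seeded generator are arbitrary choices, and multiple mergers are kept rather than neglected, so the ratios agree with Kingman's values only up to corrections that vanish for large $N$.

```python
import random


def coalescence_time(p, n, rng):
    """Generations back to the most recent common ancestor of p distinct
    individuals in a Wright-Fisher population of size n."""
    lineages = set(rng.sample(range(n), p))
    generations = 0
    while len(lineages) > 1:
        # Each lineage picks its parent uniformly at random; lineages that
        # pick the same parent merge because the result is a set.
        lineages = {rng.randrange(n) for _ in lineages}
        generations += 1
    return generations


def mean_coalescence_time(p, n, samples, rng):
    """Average of coalescence_time over independent samples."""
    return sum(coalescence_time(p, n, rng) for _ in range(samples)) / samples
```

With, say, `rng = random.Random(1)` and $N = 50$, the estimate of $\langle T_2 \rangle$ should be close to $N$ and the estimated ratio $\langle T_3 \rangle / \langle T_2 \rangle$ close to 4/3, within statistical error.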



C. Coalescing random walks in finite dimension 

We are now going to look at coalescing random walks
on a hypercubic lattice of $N = L^d$ sites in dimension $d$ with
periodic boundary conditions. We consider the continuous
time case, where during each infinitesimal time interval $\delta t$,
each walker on the lattice has a probability $\delta t$ of hopping
to each of its neighboring sites, and whenever two
walkers occupy the same site, they instantaneously coalesce
into a single walker. If $T_2(\vec r)$ is the coalescence time
between two walkers at a distance $\vec r$ apart, its evolution
is

$$T_2(\vec r) = \begin{cases} \delta t + T_2(\vec r) & \text{with probability } 1 - 4d\,\delta t, \\ \delta t + T_2(\vec r + \vec e_i) & \text{with probability } 2\,\delta t, \end{cases} \qquad (9)$$

where $\vec e_i$ is one of the $2d$ unit vectors on the hypercubic
lattice.
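
As a small worked illustration of Eq. (9) (a sketch, not taken from the paper's own derivation): averaging Eq. (9) gives, for $\vec r \neq \vec 0$, the linear relation $4d\,\langle T_2(\vec r)\rangle - 2\sum_{i}\langle T_2(\vec r + \vec e_i)\rangle = 1$, with $\langle T_2(\vec 0)\rangle = 0$. On a ring ($d = 1$) of $L$ sites, the candidate solution $\langle T_2(r)\rangle = r(L-r)/4$ can be checked exactly; the function names are hypothetical.

```python
from fractions import Fraction


def mean_t2_ring(r, L):
    """Candidate closed form for <T_2(r)> on a ring of L sites (d = 1)."""
    return Fraction(r * (L - r), 4)


def residual(r, L):
    """4 t(r) - 2 [t(r+1) + t(r-1)] - 1, obtained by averaging Eq. (9) for d = 1.

    It should vanish for every separation 0 < r < L (positions taken modulo L).
    """
    def t(x):
        return mean_t2_ring(x % L, L)

    return 4 * t(r) - 2 * (t(r + 1) + t(r - 1)) - 1


def mean_t2_uniform(L):
    """Average of the candidate <T_2(r)> over all L separations r = 0..L-1."""
    return sum(mean_t2_ring(r, L) for r in range(L)) / L
```

Averaging over a uniformly chosen separation gives $(L^2 - 1)/24$ for this candidate, i.e. a coalescence time growing as $L^2$ in one dimension.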

It is clear that the distance between the two walkers
performs a random walk and that $T_2$ is simply the first
time that this distance vanishes. This is of course a very
well known first passage problem [24, 25] which can be
solved easily (it reduces to the inversion of a Laplacian):
the generating function of $T_2(\vec r)$ satisfies for $\delta t \ll 1$ and
for $\vec r \neq \vec 0$

where $\langle \cdot \rangle$ denotes an average over all the random walks.
At $\vec r = \vec 0$, it satisfies the boundary condition

For large $L$, it is well known that Eq. (15) gives

On the other hand, one can show that the second term in
the right-hand side of Eq. (16) grows as $L^d$ in dimension
$d > 4$ and as $L^4$ in dimension $d < 4$. Therefore for $d > 2$,
the ratio $\langle T_2^2 \rangle / \langle T_2 \rangle^2$ goes to 2 when $L \to \infty$, as in the
mean field case Eq. (3).

In fact it has been proved [23] that in $d > 2$ (and
for large $L$) the whole genealogies of $p$ individuals (averaged
over all their positions) are given by the Kingman
coalescent, up to the rescaling (17). In particular the
distribution of the time $T_2$ is exponential.
D. Coalescing random walks in one dimension

In dimension $d < 2$, the two terms in the right-hand
side of Eq. (16) are comparable, and the ratio $\langle T_2^2 \rangle / \langle T_2 \rangle^2$
no longer converges to 2.

In dimension $d = 1$ the calculation of all the moments
of the times $T_p$ is rather straightforward. First one can
easily solve Eq. (12) for periodic boundary conditions

and this can be easily solved in Fourier space to give

where the constant $A(\lambda)$ is fixed by the condition of

Eq. m. 



Starting with two particles at random positions on the
lattice and averaging over these two positions leads to

For large $L$, this becomes a scaling function of $\lambda L^2$ and
of $r/L$

and, averaging over $r$, one gets

which shows that the distribution of $T_2$ is no longer exponential.

One can write down the equations satisfied by the
generating functions of the times $T_p$. For large $L$ and
$\lambda = O(L^{-2})$ the solution is

where the $r_i$ are the distances between consecutive particles
along the ring (one has of course $r_1 + \cdots + r_p = L$).
In particular, for $p = 2$, $r_1 = r$ and $r_2 = L - r$, one recovers
Eq. (19). Averaging Eq. (21) over all the positions of
the $p$ particles on the ring leads to
(e XT -)~p{p~l) 







From Eq. (22) one can then obtain all the moments of
$T_p$. For example, one has

in contrast with Eq. (3) and Eq. (4).

One could repeat the calculations which lead to
Eqs. (15), (16), (17) and Eq. (24) for models of coalescence
on other lattices or with more general jumping rates. As
long as the motion of the coalescing particles remains
diffusive, one would recover the same values Eq. (3) or
Eq. (24) for the statistics of the trees.



E. Neutral evolution in finite dimension 

One can try to generalize the Wright-Fisher model to
the finite dimensional case, for example by considering a
hypercubic lattice with a finite population of fixed size $m$ on each
lattice site, and the case where each individual chooses
its parent in the previous generation with a probability $p$
on the same lattice site and with probability $1 - p$ on one
of the neighboring sites. The study of the genealogies in
this case is the same problem as following the
coalescences of the lineages, which perform random walks
on this lattice. Therefore in dimension $d = 2$ and above,
the trees are given by the statistics Eq. (3) of Kingman's
coalescent, whereas in dimension $d = 1$ they will be in the
universality class Eq. (24) of coalescing random walks in
one dimension.



II. DIRECTED POLYMERS IN A RANDOM 
MEDIUM 




Directed polymers in a random medium is one of
the simplest examples of a strongly disordered system
[3 E3 EH ES]. It describes directed paths in a random
energy landscape. In its zero temperature version, the
problem reduces to finding the optimal path, i.e. the path
of minimal energy in this random energy landscape. The
optimal paths starting at the same point but arriving at
different points give rise to a tree structure, which we try to
characterize in this section by measuring the coalescence
times $T_p$.

A directed polymer in dimension $d + 1$ is a line extending
in one of the directions (traditionally called "time",
and which we represent as the vertical direction in Fig. 2
and Fig. 3) with some random excursions in the $d$ other
transverse directions (see Fig. 2).

FIG. 2: (Left) A directed polymer in dimension 1 + 1. The
"time" direction is vertical. (Right) A directed polymer arriving
at A comes either from B or from C, whichever is more
energetically favorable. In the example shown, the coalescence
time of the directed polymers arriving at B and C is
four.

We consider here directed polymers on a lattice which
is infinite in the "time" direction but finite and periodic
in the $d$ transverse directions. In each time section, there
are $N = L^d$ sites located on a $d$-dimensional hypercube
of linear size $L$ with periodic boundary conditions. Each
site in a given time section is connected to $M = 2^d$ sites
in the previous time section (and it is also connected
to $M$ other sites in the next time section). The way
each site is connected is shown for dimension 1 + 1 in
Fig. 2. In higher dimension, we generalized the lattice
of Fig. 2 in the following way: let $x = (x_1, x_2, \ldots, x_d)$
be the transverse coordinates of a given site; the $x_i$ are
integers at even times and half-integers at odd times, and
the $M = 2^d$ potential parent sites of $x$ have coordinates
$(x_1 \pm 1/2, x_2 \pm 1/2, \ldots, x_d \pm 1/2)$ in the previous time
section.

We consider also a mean-field version (Fig. 3), where
there is no spatial structure in the transverse directions:
a time section consists of a set of $N$ sites, and each of
them is connected to $M$ sites chosen at random among
the $N$ sites of the previous time section, where $M$ might
be any number between 2 and $N$.

FIG. 3: Directed polymer in mean-field. At a given time, each
of the $N = 4$ sites is connected to $M = 2$ random sites at the
previous time.

We assume that each link (AB) between two connected
sites A and B carries a random energy $\epsilon_{(AB)}$. The energy
$E$ of the polymer is then the sum of all the energies $\epsilon_{(AB)}$
of the visited links.

We choose an origin where the polymer starts, and for
any given site A on the lattice, we call $E_A$ the minimal
energy of the polymer over all the possible directed paths
connecting this origin to A. At zero temperature, the
directed polymer chooses the path which minimizes its
energy and one has the simple recursion relation

$$E_A = \min\left( E_B + \epsilon_{(AB)},\; E_C + \epsilon_{(AC)},\; \ldots \right), \qquad (25)$$

where B, C, ... are the $M$ potential parent sites of site
A.
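
One time step of the recursion Eq. (25) can be sketched as follows. This is a minimal illustration, not the authors' simulation code: `parents_of` and `bond_energy` are hypothetical callables supplied by the user (for instance, on a ring of $L$ sites in dimension 1 + 1 one may take the parents of site `i` to be `i` and `(i + 1) % L`), and ties are broken here by the smaller parent index, which is irrelevant for continuous energy distributions.

```python
def polymer_step(energies, bond_energy, parents_of):
    """Apply Eq. (25) once to every site of a time section.

    energies    : list of minimal energies E_B in the previous time section
    bond_energy : callable (site, parent) -> energy of the link (hypothetical)
    parents_of  : callable site -> iterable of potential parent sites (hypothetical)

    Returns the new minimal energies E_A and, for each site, the parent
    through which the optimal path arrives.
    """
    new_energies, chosen_parent = [], []
    for site in range(len(energies)):
        candidates = [
            (energies[parent] + bond_energy(site, parent), parent)
            for parent in parents_of(site)
        ]
        best_energy, best_parent = min(candidates)
        new_energies.append(best_energy)
        chosen_parent.append(best_parent)
    return new_energies, chosen_parent
```

The list `chosen_parent` is what defines the tree of optimal paths: following it backward in time from several sites gives their coalescence times.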

For any pair of sites A and A' in the same time section,
we define their coalescence time (see Fig. 2) as the
number of up steps during which the two optimal paths
arriving at A and A' differ (we suppose that the origin
of the directed polymers is at a remote enough time in
the past for the paths to coalesce). In a similar way, we
define the coalescence time of any group of $p$ different
sites as the maximal coalescence time of any pair within
the $p$ sites. All these quantities depend on the chosen
sites and on the realization of the disorder, and, as in
the previous section, we denote by $\langle \cdot \rangle$ the average over the
choice of sites and the disorder. In this section, we consider
the averaged coalescence time $\langle T_p \rangle$ and the averaged
square of the coalescence time $\langle T_p^2 \rangle$ of $p$ sites.

We have simulated four models in dimension 1 + 1;
from top to bottom in Fig. 4:

• on the lattice of Fig. 2 with a discrete distribution
of $\epsilon$, with values $\epsilon = 0$ or $\epsilon = 1$ with probabilities
1/2,

• on the lattice of Fig. 2 with a uniform distribution
of $\epsilon$ in [0, 1],

• on the lattice of Fig. 2 with negative values of $\epsilon$,
distributed according to $\rho(\epsilon) = e^{\epsilon}\,\theta(-\epsilon)$,

• on a square lattice where each site is connected to
$M = 3$ parents (just above itself, on its right and on
its left), where $\epsilon$ takes positive values with an exponentially
decreasing distribution: $\rho(\epsilon) = e^{-\epsilon}\,\theta(+\epsilon)$.

In dimension 2 + 1, we have simulated three models, all on
the lattice with $M = 4$ ancestors described above; from
top to bottom in Fig. 4:

• with an exponentially increasing distribution:
$\rho(\epsilon) = e^{+\epsilon}\,\theta(-\epsilon)$,

• with a uniform distribution of $\epsilon$ in [0, 1],

• with an exponentially decreasing distribution:
$\rho(\epsilon) = e^{-\epsilon}\,\theta(+\epsilon)$.

Finally, we have simulated two models in mean-field with
a uniform distribution of $\epsilon$ in [0, 1] and either $M = 2$ or
$M = 4$ random ancestors for each site ($M = 2$ is above
$M = 4$ in Fig. 4). Our data for all these models are plotted
together with the same symbol for each dimension to
emphasize the universality of our results.

To measure the $T_p$'s, the conceptually simplest way
is to update an $N \times N$ matrix containing, for all
pairs $(i, j)$ of individuals, the age $T_2(i, j)$ of their
most recent common ancestor. Indeed, for an arbitrary
number $p$ of individuals, one has $T_p(i_1, \ldots, i_p) =
\max[T_2(i_1, i_2), T_2(i_1, i_3), \ldots, T_2(i_1, i_p)]$, so that the matrix
of the $T_2$'s contains all the relevant information. Updating
this matrix at each time step is easy: the $T_2$ of
two different sites is one plus the $T_2$ of their parents, and
the $T_2$ of a site with itself is zero. Because updating
an $N \times N$ matrix at each time step is time consuming,
we used a more sophisticated method [21], where we keep
track of the genealogical tree of all the sites at the current
time: there are of course $N$ sites at the current time, and
at most $N - 1$ nodes, where a node is a site from previous
times which is the most recent common ancestor of two
sites at the current time. At each time step, updating
the whole tree takes a time linear in $N$, and averaging
the $T_p$ over all the choices of $p$ individuals also takes a
time linear in $N$, as one simply has to recursively walk
down the tree from its root and count for each node the
number of times it is the most recent common ancestor
of $p$ sites at the current time. This algorithm is described
in more detail in [21].
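
The conceptually simplest method (the $N \times N$ matrix update, not the linear-time tree algorithm of [21]) can be sketched as below. The function names are hypothetical, and `parent[i]` is assumed to hold the index, in the previous time section, of the parent chosen by site `i`.

```python
def update_t2(t2, parent):
    """Advance the matrix of pairwise coalescence times by one time step.

    The T_2 of two different sites is one plus the T_2 of their parents,
    and the T_2 of a site with itself is zero.
    """
    n = len(parent)
    return [
        [0 if i == j else 1 + t2[parent[i]][parent[j]] for j in range(n)]
        for i in range(n)
    ]


def t_p(t2, sites):
    """T_p of a group of sites: max of T_2(i_1, i_k) over k = 2..p."""
    first = sites[0]
    return max(t2[first][other] for other in sites[1:])
```

Starting from an all-zero matrix and applying `update_t2` repeatedly, the entries become meaningful once the number of steps exceeds the largest coalescence time, i.e. after the transient during which the initial condition is forgotten.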

For each data point, we have run one long simulation
and averaged our results over all the time steps once the
steady state was reached. This is equivalent to averaging
over many independent realizations if we run a simulation
for a time much longer than the correlation time, which
we estimated to be of the order of magnitude of $\langle T_2 \rangle$. All
of our simulations were at least 20 000 times longer than
$\langle T_2 \rangle$.
FIG. 4: Averaged coalescence time $\langle T_2 \rangle$ of two individuals
for several models of directed polymers in dimensions 1 + 1
and 2 + 1, and in mean-field, as a function of the number
$N$ of sites in each time section. The data are compared to
the prediction $\langle T_2 \rangle \propto N^{1/(d\nu)}$ in dotted lines for dimensions
1 + 1 and 2 + 1. Note that, by chance, two out of the four
models in dimension 1 + 1 and two out of the three models
in dimension 2 + 1 have nearly the same prefactor and their
data are indistinguishable.

In Fig. 4 we plot the coalescence time $\langle T_2 \rangle$ as a function
of the system size. For directed polymers on a lattice
which is infinite both in the time direction and in the $d$
transverse directions, the transverse displacement of the
optimal path scales like $t^{\nu}$, where $t$ is the length of the
directed polymer and $\nu$ is a universal exponent [7] equal
to $\nu_{1+1} = 2/3$ in dimension 1 + 1 and $\nu_{2+1} \simeq 0.624$ in
dimension 2 + 1. In our setup, with a lattice of finite linear
size $L$ in the transverse directions, this scaling can only
hold as long as $t < T_{\rm corr}$ with $T_{\rm corr} = L^{1/\nu} = N^{1/(d\nu)}$. This
time $T_{\rm corr}$ is the correlation time on the scale of which
the system forgets its initial condition. Moreover, if we
consider several sites and the optimal paths arriving at
these sites, these paths coalesce on a time scale of the
order of $T_{\rm corr}$, as can be seen in Fig. 4.

In mean-field with a finite number $M$ of potential
ancestors per site, there is no notion of distance in
the transverse directions, and the exponent $\nu$ is meaningless.
We therefore expect a different scaling. The
problem of zero-temperature mean-field directed polymers
can be formulated [30] as a noisy Fisher-KPP like
equation [31, 32]. Recently, a phenomenological theory
of coalescence trees in models of Fisher-KPP fronts
suggested [21, 33, 34] that the coalescence time in such
models should be of order $T_{\rm corr} \propto (\ln N)^3$. In Fig. 4
one can see that the data seem to grow more slowly
than a power law, but the values of $N$ we simulated here
are too small to check the $(\ln N)^3$ prediction. Better simulations
on a closely related model are presented in [21],
where the $(\ln N)^3$ scaling appears clearly.
FIG. 5: Ratios of coalescence times for directed polymers at
zero temperature as a function of the size $N$ of the system
in, from top to bottom, dimension 1 + 1, dimension 2 + 1
and mean-field. The dotted line represents the prediction
Eq. (4) for mean-field in the limit of infinite size. (Left) ratios
$\langle T_3 \rangle / \langle T_2 \rangle$. (Right) ratios $\langle T_4 \rangle / \langle T_2 \rangle$.

We now turn to the ratios of coalescence times. Fig. 5
shows the ratios $\langle T_3 \rangle / \langle T_2 \rangle$ and $\langle T_4 \rangle / \langle T_2 \rangle$ as a function of
the system size for all the models we study (four models
in dimension 1 + 1, three in dimension 2 + 1 and two
in mean-field). Numerically, for large $N$, these ratios
seem to depend only on the dimension, and not on the
distribution $\rho(\epsilon)$ of the bond energies, nor on the shape
of the lattice. The results in mean-field are compatible
with the prediction that for an infinitely large system in
the Fisher-KPP front equation class [21], the genealogical
tree converges to a Bolthausen-Sznitman coalescent, with
ratios given by Eq. (4). In dimensions 1 + 1 and 2 + 1,
our numerical results indicate clearly that we have tree
statistics different from the Bolthausen-Sznitman coalescent,
and also different from the Kingman coalescent, for
which $\langle T_3 \rangle / \langle T_2 \rangle$ would be 4/3 as in Eq. (3).
FIG. 6: Ratios of moments of the coalescence times for directed
polymers at zero temperature as a function of the size
$N$ of the system in, from top to bottom, dimension 1 + 1,
dimension 2 + 1 and mean-field. The dotted line represents
the prediction Eq. (4) for mean-field in the limit of infinite
size. (Left) ratios $\langle T_2^2 \rangle / \langle T_2 \rangle^2$. (Right) ratios $\langle T_3^2 \rangle / \langle T_2 \rangle^2$.

In Fig. 6 we show the ratios $\langle T_2^2 \rangle / \langle T_2 \rangle^2$ and
$\langle T_3^2 \rangle / \langle T_2 \rangle^2$. Here, the situation is less clear: the symbols
for the different models do not superpose and the ratios
do not seem to have converged (in particular, the mean-field
ratios are rather far from the prediction Eq. (4)).
For some reason we do not understand, it seems that the
$\langle T_p^2 \rangle / \langle T_2 \rangle^2$ need much larger values of $N$ to converge to
their final values than the $\langle T_p \rangle / \langle T_2 \rangle$. We already observed
a similar phenomenon on an exactly solvable related
model [21].

We also measured the ratios $\langle T_N \rangle / \langle T_2 \rangle$, where $T_N$ is
the age of the most recent common ancestor of the whole
population, and found these ratios to be close to 1.93 in
dimensions 1 + 1 and 2 + 1, while they diverge in mean-field.



A. Long tail distributions 

In the directed polymer problem, it is known that the
scaling regime is modified when the distribution $\rho(\epsilon)$ of
the energies of the bonds decays as a power law $\rho(\epsilon) \propto
|\epsilon|^{-\alpha}$ for large negative $\epsilon$: when $\alpha < \alpha_c$ with $\alpha_c \simeq 7$, the
directed polymer in dimension 1 + 1 has an anomalous
scaling [35] and the exponent $\nu$ depends on $\alpha$. We
have measured the coalescence times in dimension 1 + 1
for a distribution of energies given by

$$\rho(\epsilon) = \frac{A(\alpha)}{(1 + |\epsilon|)^{\alpha}}, \qquad (26)$$

with $\alpha > 1$ and $A(\alpha)$ such that $\rho(\epsilon)$ is normalized, for
sizes $N = 100$ and $N = 400$. The ratios $\langle T_3 \rangle / \langle T_2 \rangle$ and
$\langle T_4 \rangle / \langle T_2 \rangle$ are presented in Fig. 7. We observe that, for
large $\alpha$, these ratios converge towards the universal values
shown in Fig. 5, while for $\alpha \to 1^+$, they seem to converge
close to, respectively, 1.24 and 1.35.
FIG. 7: Ratios $\langle T_3 \rangle / \langle T_2 \rangle$ and $\langle T_4 \rangle / \langle T_2 \rangle$ as a function of the
exponent $\alpha$ appearing in the noise Eq. (26), for two different
system sizes in dimension 1 + 1.



As we expect $\langle T_2 \rangle$ to scale like $N^{1/\nu(\alpha)}$, it is possible
to obtain a rough estimate of the exponent $\nu(\alpha)$ from the
only two data points at sizes $N = 100$ and $N = 400$. This
estimate is shown in Fig. 8. The exponent $\nu(\alpha)$ seems to
converge toward the universal value $\nu_{1+1} = 2/3$ for large
$\alpha$, while it seems to be 1 for $1 < \alpha < 2$.

FIG. 8: Estimate of the exponent $\nu$ of the directed polymer
as a function of the exponent $\alpha$ appearing in the distribution
of $\epsilon$ Eq. (26). This exponent has been evaluated from
the formula $\ln(4) / \ln\left[\langle T_2(N{=}400) \rangle / \langle T_2(N{=}100) \rangle\right]$. The universal
value $\nu_{1+1} = 2/3$ for distributions decaying fast enough is
also shown.

As with previous numerical studies [7], our results are
not precise enough to determine the critical value $\alpha_c$
above which $\nu = 2/3$.
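
The two-point estimate quoted in the caption of Fig. 8 amounts to the following short computation (a sketch; the function name and default sizes are illustrative, and the inputs are whatever values of the averaged two-point coalescence time were measured at the two sizes).

```python
from math import log


def estimate_nu(t2_small, t2_large, n_small=100, n_large=400):
    """Estimate nu assuming <T_2> scales like N^(1/nu) in dimension 1 + 1.

    With N = 100 and N = 400 this is ln(4) / ln[<T_2(400)> / <T_2(100)>].
    """
    return log(n_large / n_small) / log(t2_large / t2_small)
```

For instance, data following exactly $\langle T_2 \rangle \propto N^{3/2}$ would return $\nu = 2/3$. With only two sizes, corrections to scaling are not controlled, which is why the estimate is rough.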



B. Discrete distributions 

We are now going to discuss the case where the energies
of the bonds take discrete values. In this case, it may
happen in Eq. (25) that there are several paths coming
from different potential parent sites in the previous time
section with the same minimal energy, and the question
is, of course, which of them should be selected as the parent
site.

The simplest idea is to choose randomly at each time
step, with equal probabilities, one of the paths with the
lowest energy. With this procedure, we have run numerical
simulations in dimension 1 + 1 for several sizes with
a binary noise for the energies $\epsilon$ of the bonds,

$$\epsilon = \begin{cases} 0 & \text{with probability } p, \\ 1 & \text{with probability } 1 - p, \end{cases} \qquad (27)$$

for several values of $p$. Our results for the ratio $\langle T_3 \rangle / \langle T_2 \rangle$
as a function of $p$ are shown in Fig. 9 as dashed lines.
As $p$ varies, we observe a crossover between two values:
for small $p$, $\langle T_3 \rangle / \langle T_2 \rangle \simeq 1.36$ as for directed polymers
in dimension 1 + 1 when the distribution of energies is
continuous (see Fig. 5) and, for large $p$, $\langle T_3 \rangle / \langle T_2 \rangle \simeq 1.4$,
which corresponds to the coalescence of random walks
in dimension 1, as in Eq. (24). The crossover between
the two regions becomes sharper as $L$ increases, which
suggests a phase transition. The critical value of $p$ is
consistent with the known threshold 0.6447 for directed
percolation on the same lattice [35]. Thus, the system
behaves like the neutral model when the $\epsilon = 0$ bonds
percolate.
FIG. 9: Ratios $\langle T_3 \rangle / \langle T_2 \rangle$ as a function of $p$ for the distribution
of $\epsilon$ of Eq. (27). The dashed lines correspond to the simplest
procedure of choosing with equal probabilities one of the
potential parent sites realizing the minimal energy, and the
solid lines represent the results using the weights $\Omega$, which
corresponds to the $T \to 0^+$ limit of finite temperature directed
polymers. The vertical dotted line indicates the directed
percolation threshold on the same lattice.

Instead of choosing with equal probabilities which
bond the polymer follows when several are energetically
equivalent, there is an alternative procedure which corresponds
to taking the limit $T \to 0^+$ in the problem of
directed polymers at a finite temperature $T$. At finite
temperature, we keep track for each site A of the partition
function $Z_A$ of a polymer arriving at A. Assuming
that the site A has $M = 2$ potential parent sites B and
C, we have the recursion (cf. Eq. (25))

$$Z_A = Z_{A \leftarrow B} + Z_{A \leftarrow C}, \qquad (28)$$

where $Z_{A \leftarrow B} = Z_B \exp(-\beta \epsilon_{(AB)})$ is the partition function
of a directed polymer arriving at A via the site B
and where $\beta = 1/T$. The probability that a polymer
reaching A comes from B is given by

$$(\text{Prob. the polymer comes from B}) = \frac{Z_{A \leftarrow B}}{Z_A}. \qquad (29)$$

At very low temperature, the partition function is dominated
by the lowest energy paths:

$$Z \simeq \Omega\, e^{-\beta E}, \qquad (30)$$

where $E$ is the minimal energy and $\Omega$ the number of ways
that this energy $E$ can be obtained, so that Eq. (28)
reads, at low temperature,

$$\Omega_A e^{-\beta E_A} \simeq \Omega_B e^{-\beta E_{A \leftarrow B}} + \Omega_C e^{-\beta E_{A \leftarrow C}}, \qquad (31)$$

where $E_{A \leftarrow B} = E_B + \epsilon_{(AB)}$ is the minimal energy of
the path arriving at A through B. If $E_{A \leftarrow B} < E_{A \leftarrow C}$,
then the first term in the right-hand side of Eq. (31)
dominates and we obtain $E_A = E_{A \leftarrow B}$ and $\Omega_A = \Omega_B$.
Furthermore, from Eq. (29), the chosen path comes from
B. On the other hand, if $E_{A \leftarrow B} = E_{A \leftarrow C}$, both terms in
Eq. (31) have the same order of magnitude and we obtain
$E_A = E_{A \leftarrow B} = E_{A \leftarrow C}$ and $\Omega_A = \Omega_B + \Omega_C$. Then, from
Eq. (29), the probability that the directed polymer comes
from B is $\Omega_B / \Omega_A$.

In this way, we not only choose the optimal energy 
but we also keep track of entropy effects. We have run 
numerical simulations with the same parameters as above 
but with this new procedure. The ratios $\langle T_3 \rangle / \langle T_2 \rangle$ are
shown in Fig. 9 in plain lines. For small values of p, both
procedures yield the same results. For larger p, however, 
the difference is striking, and the phase transition seems 
to have disappeared: on both sides of the percolation 
threshold the data seem to be in the same universality 
class (as they converge to $\approx 1.36$).



III. CONCLUSION 

In this paper, we have presented analytical and numer- 
ical results showing the existence of universality classes 
in the tree structures which appear in several models of 
evolution and in directed polymers (see Tab. I for a summary).

Without selection, the genealogies of neutral models
like the Wright-Fisher model or coalescing random walks
are described above the critical dimension $d_c = 2$ by the
Kingman coalescent. For d = 1 the universality class
is different: we have obtained the distribution Eq. (22)
of the ages $T_p$ of the most recent common ancestor of p
individuals.
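For reference, the mean ages in the Kingman coalescent follow from a short exact computation. The sketch below assumes time units in which each pair coalesces at rate 1, so that k lineages wait a mean time 2/(k(k-1)) before the next coalescence; the function name `kingman_mean_age` is illustrative.

```python
from fractions import Fraction


def kingman_mean_age(p):
    """Mean age of the most recent common ancestor of p individuals
    in the Kingman coalescent, with coalescence rate 1 per pair."""
    # With k lineages there are k(k-1)/2 pairs, hence a mean waiting
    # time 2/(k(k-1)); the age T_p sums these times from k = p down to 2.
    return sum(Fraction(2, k * (k - 1)) for k in range(2, p + 1))


# Ratios of mean ages <T_p>/<T_2> for p = 3 and p = 4
ratio_3 = kingman_mean_age(3) / kingman_mean_age(2)
ratio_4 = kingman_mean_age(4) / kingman_mean_age(2)
```

The sum telescopes to 2(1 - 1/p), so these ratios are 4/3 and 3/2.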



For directed polymers in a random medium, the same 
coalescence times $T_p$ have been measured numerically.
In the mean field case, their values are compatible 
with Bolthausen-Sznitman's coalescent, which is already 
known to appear in spin glasses [19] and in branching
random walks with a selection mechanism keeping the 
size constant [5T] [53] . In low dimension (at least d = 1 
and d = 2), the coalescence times belong to different
universality classes. It would be interesting to predict
analytically the numerical values of $\langle T_3 \rangle / \langle T_2 \rangle$ and $\langle T_4 \rangle / \langle T_2 \rangle$
measured in Fig. 5 for fast decaying distributions of $\epsilon$ as
well as the ones obtained in Fig.JjJfor power-law distribu- 
tions of e with exponent a ~ 1 + . In the mean- field case, 
it would also be interesting to know if the replica method 
can be used in order to determine the coalescence times. 

The simulations presented in this paper deal only with 
directed polymers at T = 0. Directed polymers ex- 
hibit a phase transition for d > 2 as the temperature 
increases [37]. We expect the tree statistics to change at 
$T_c$ from the universality class of directed polymers at
zero temperature to the universality class of coalescing 
random walks. 

The construction of the minimal energy path for
directed polymers can be related to spatial models in the
presence of selection. In population dynamics, selection can
be taken into account through a parameter, called the fit- 
ness or the adaptability, which characterizes the ability of 
an individual to survive and reproduce [351 1351 1301 |4"T1 |4"2"] . 
Individuals with a higher fitness have a higher probability
of leaving descendants. This parameter is transmitted
from parents to offspring up to fluctuations due to
mutations. An analogy can be drawn between the minimal
energy of a directed polymer arriving on a site and minus
the fitness of an individual living on a site. In the presence of
local selection, a spatial model of population could
therefore be formulated as follows: on each site there would be
one individual (or a finite number m of individuals); at each
generation, each individual would branch into k offspring
with mutated fitnesses. These offspring diffuse and,
under the effect of selection, only the best (or the m best)
individual(s) on each site would be kept. Because of the
similarity of such spatial models of population dynamics
in the presence of selection with directed polymers, we
expect these models to belong to the same universality
classes.
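One generation of such a spatial model could be sketched as follows. Several details are assumptions made only for illustration and are not specified above: the sites form a ring, mutations are Gaussian with standard deviation `sigma`, and each offspring hops to a site chosen uniformly among its parent's site and the two neighbours. Each individual carries the label of its ancestor at generation 0, so that coalescence can be followed through the number of surviving labels.

```python
import random


def next_generation(sites, k=2, m=1, sigma=1.0, rng=random):
    """Advance a ring of sites by one generation.

    sites: list of lists of (fitness, ancestor_label) tuples.
    k: offspring per individual; m: individuals kept per site.
    The Gaussian mutation and the nearest-neighbour hop are assumptions.
    """
    n_sites = len(sites)
    arrivals = [[] for _ in range(n_sites)]
    for x, inhabitants in enumerate(sites):
        for fitness, ancestor in inhabitants:
            for _ in range(k):
                # Offspring inherit the fitness up to a mutation
                child_fitness = fitness + rng.gauss(0.0, sigma)
                # Diffusion: hop to x - 1, x or x + 1 on the ring
                y = (x + rng.choice((-1, 0, 1))) % n_sites
                arrivals[y].append((child_fitness, ancestor))
    # Local selection: keep only the m fittest arrivals on each site
    return [
        sorted(a, key=lambda ind: ind[0], reverse=True)[:m]
        for a in arrivals
    ]
```

Starting from `sites = [[(0.0, x)] for x in range(L)]` and iterating, the number of distinct ancestor labels can only decrease, which is the coalescence process whose times are measured.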

We performed preliminary simulations on such a spa- 
tial model of evolution with selection in dimension 1 + 1 
with m = 5 individuals per site. Our results for the ratios
$\langle T_3 \rangle / \langle T_2 \rangle$ and $\langle T_4 \rangle / \langle T_2 \rangle$ coincide with those of directed
polymers.

TABLE I: Universal ratios and orders of magnitude of coalescence times for models of evolution with and without selection, and
directed polymers in a random medium. We could not reach large enough system sizes to give a reliable numerical prediction
for the ratios $\langle T_p^2 \rangle / \langle T_p \rangle^2$.